
    The Treachery of Images: Bayesian Scene Keypoints for Deep Policy Learning in Robotic Manipulation

    In policy learning for robotic manipulation, sample efficiency is of paramount importance. Learning and extracting more compact representations from camera observations is therefore a promising avenue. However, current methods often assume full observability of the scene and struggle with scale invariance. In many tasks and settings, this assumption does not hold, as objects in the scene are often occluded or lie outside the field of view of the camera, rendering the camera observation ambiguous with regard to their location. To tackle this problem, we present BASK, a Bayesian approach to tracking scale-invariant keypoints over time. Our approach resolves inherent ambiguities in images, enabling keypoint tracking on symmetrical, occluded, and out-of-view objects. We employ our method to learn challenging multi-object robot manipulation tasks from wrist camera observations and demonstrate superior utility for policy learning compared to other representation learning techniques. Furthermore, we show outstanding robustness to disturbances such as clutter, occlusions, and noisy depth measurements, as well as generalization to unseen objects, in both simulation and real-world robotic experiments.
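    The abstract does not give BASK's actual update equations, but the core idea it names, maintaining a belief over keypoint-location hypotheses and letting later observations resolve ambiguity, can be sketched as a minimal discrete Bayes filter. All quantities here are illustrative placeholders, not the paper's model:

```python
def bayes_update(prior, likelihood):
    """One recursive Bayes step over a discrete set of keypoint
    location hypotheses: posterior is prior times likelihood,
    renormalized to sum to one."""
    post = [p * l for p, l in zip(prior, likelihood)]
    total = sum(post)
    if total == 0.0:  # observation rules out every hypothesis: keep prior
        return prior
    return [p / total for p in post]

# A symmetric object makes one view ambiguous between hypotheses 0 and 1.
belief = [0.25, 0.25, 0.25, 0.25]
belief = bayes_update(belief, [0.5, 0.5, 0.0, 0.0])  # ambiguous view
belief = bayes_update(belief, [0.9, 0.1, 0.0, 0.0])  # later, disambiguating view
```

    After the second observation the belief concentrates on hypothesis 0: accumulating evidence over time is what lets a tracker survive views in which a single image is ambiguous.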

    Interactive Imitation Learning in Robotics: A Survey

    Interactive Imitation Learning (IIL) is a branch of Imitation Learning (IL) in which human feedback is provided intermittently during robot execution, allowing online improvement of the robot's behavior. In recent years, IIL has increasingly carved out its own space as a promising data-driven alternative for solving complex robotic tasks. The advantages of IIL are its data efficiency, as the human feedback guides the robot directly toward improved behavior, and its robustness, as the distribution mismatch between teacher and learner trajectories is minimized by providing feedback directly on the learner's trajectories. Nevertheless, despite the opportunities that IIL presents, its terminology, structure, and applicability are neither clear nor unified in the literature, slowing down its development and, therefore, the research of innovative formulations and discoveries. In this article, we attempt to facilitate research in IIL and lower entry barriers for new practitioners by providing a survey that unifies and structures the field. In addition, we aim to raise awareness of its potential, of what has been accomplished, and of what open research questions remain. We organize the most relevant works in IIL in terms of human-robot interaction (i.e., types of feedback), interfaces (i.e., means of providing feedback), learning (i.e., models learned from feedback and function approximators), user experience (i.e., human perception of the learning process), applications, and benchmarks. Furthermore, we analyze similarities and differences between IIL and Reinforcement Learning (RL), discussing how the concepts of offline, online, off-policy, and on-policy learning should be transferred from the RL literature to IIL. We particularly focus on robotic applications in the real world and discuss their implications, limitations, and promising future areas of research.
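    The key property described above, that corrective labels are collected on the learner's own trajectory, can be illustrated with a toy DAgger-style loop. The 1-D dynamics, the linear learner, and the "human" teacher below are all hypothetical, chosen only to make the feedback loop concrete; this is not the survey's formalism:

```python
def teacher(state):
    # Hypothetical human correction: steer the state toward zero.
    return -state

# Learner: a one-parameter linear policy, action = w * state.
w, state, lr = 0.0, 1.0, 0.5
for step in range(40):
    if step % 4 == 0:                           # intermittent human feedback
        target = teacher(state)                 # corrective label, given on the
        w += lr * (target - w * state) * state  # learner's own visited state
    state += 0.5 * (w * state)                  # robot acts with its current policy
```

    Because labels are queried where the learner actually goes, the policy improves exactly on its own state distribution, which is the mechanism by which IIL reduces the teacher/learner distribution mismatch.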

    Learning from Simulation, Racing in Reality: Sim2Real Methods for Autonomous Racing

    Reinforcement Learning (RL) methods have been successfully demonstrated in robotic tasks; however, their application to continuous state-action systems with fast dynamics is challenging. In this work, we investigate RL solutions for the autonomous racing problem on the ORCA miniature race car platform. When training a deep neural network policy with RL using only simulations, we observe poor performance due to model mismatch, also known as the reality gap. We propose three methods to reduce this gap: first, a policy regularization in the policy optimization step; second, model randomization. These two methods allow learning a policy that can race the car without any real-environment interactions. Our third method improves this policy by running the RL algorithm online while driving the car. The performance achieved on the ORCA platform is comparable, in terms of lap time, to that previously achieved by a model-based controller, and improved with respect to track constraint violations.
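    The first two gap-reduction ideas named above can be sketched in a few lines. The nominal parameters, the perturbation range, and the particular regularizer form are assumptions for illustration; the paper's exact loss and vehicle model are not given in the abstract:

```python
import random

NOMINAL = {"mass": 1.0, "grip": 1.0}  # placeholder values, not the paper's

def randomized_model():
    """Model randomization: each simulated episode perturbs the vehicle
    parameters around nominal, so the policy cannot overfit one
    (inevitably wrong) simulation model."""
    return {k: v * random.uniform(0.9, 1.1) for k, v in NOMINAL.items()}

def regularized_loss(policy_loss, actions, weight=0.1):
    """Policy regularization (illustrative form): penalize step-to-step
    action changes so the learned controller stays smooth enough to
    transfer to real hardware."""
    smoothness = sum((b - a) ** 2 for a, b in zip(actions, actions[1:]))
    return policy_loss + weight * smoothness
```

    Both tricks attack the reality gap from opposite sides: randomization widens the training distribution of dynamics, while the regularizer narrows the policy's output behavior toward actions that transfer.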

    Equivalent Lap Time Minimization Strategies for a Hybrid Electric Race Car

    The powertrain of a Formula 1 car is composed of an electrically turbocharged internal combustion engine and an electric motor used for boosting and regenerative braking. The energy management system that controls this hybrid electric power unit strongly influences the achievable lap time, as well as the fuel and battery consumption. It is therefore important to design robust feedback control algorithms that can run on the ECU in compliance with the sporting regulations, and that are able to follow lap-time-optimal strategies while properly reacting to external disturbances. In this paper, we design feedback control algorithms, inspired by equivalent consumption minimization strategies (ECMS), that adapt the optimal control policy implemented on the car in real time. This way, we are able to track energy management strategies computed offline in a lap-time-optimal way using three PID controllers. We validate the presented control structure with numerical simulations and compare it to a previously designed model predictive control scheme.
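    The ECMS idea named above can be made concrete with a toy instantaneous optimization. The unit-free power model, the candidate power-split set, and the single proportional gain below are assumptions for illustration (the paper uses three PID controllers and a real power-unit model):

```python
def ecms_split(p_demand, s, p_batt_options):
    """One ECMS step: choose the battery power that minimizes the
    equivalent consumption  P_fuel + s * P_batt,  where the engine
    covers the remaining demand (toy model: fuel power equals engine
    power, with unit conversions folded into s)."""
    return min(p_batt_options, key=lambda pb: (p_demand - pb) + s * pb)

def adapt_s(s, soc, soc_target, kp=2.0):
    """Feedback adaptation of the equivalence factor s (P-term only,
    illustrative): when battery state of charge drops below target,
    raise s so electric energy looks more expensive."""
    return s + kp * (soc_target - soc)
```

    With s > 1 the optimizer prefers charging (negative battery power); with s < 1 it prefers discharging, which is how the feedback on s steers the battery back to its target without re-solving the full lap-time problem online.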

    Learning from Simulation, Racing in Reality

    We present a reinforcement learning-based solution for autonomous racing on a miniature race car platform. We show that a policy trained purely in simulation, using a relatively simple vehicle model with model randomization, can be successfully transferred to the real robotic setup. We achieve this with a novel policy output regularization approach and a lifted action space, which enable smooth actions while still allowing aggressive race car driving. The regularized policy outperforms the Soft Actor-Critic (SAC) baseline, both in simulation and on the real car, but is still outperformed by a state-of-the-art Model Predictive Control (MPC) method. Refining the policy with three hours of real-world interaction data allows it to achieve lap times similar to the MPC controller while reducing track constraint violations by 50%.
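    A lifted action space, as named above, typically means the policy outputs a rate of change of the command rather than the command itself, with an integrator producing the actual action. The time step and rate limit below are assumed values, not the paper's:

```python
def lifted_step(action, policy_rate, dt=0.02, rate_limit=5.0):
    """Lifted action space: the policy emits the *rate of change* of
    the command; clipping and integrating it yields actions that are
    smooth by construction yet can still ramp up aggressively."""
    rate = max(-rate_limit, min(rate_limit, policy_rate))
    return action + dt * rate
```

    Even if the policy asks for an extreme jump, the applied command moves by at most `rate_limit * dt` per step, which is the mechanism that keeps real-hardware actuation smooth without capping the achievable command range.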

    Gonadal Steroids and Sperm Quality in a Cohort of Relapsing Remitting Multiple Sclerosis: A Case-Control Study

    Introduction: Evaluation of the hypothalamic-pituitary-testicular axis and sperm analyses are not standard examinations for patients with Relapsing-Remitting Multiple Sclerosis (RRMS). Methods: This is a prospective case-control study. Patients aged 18–55 with a confirmed diagnosis of RRMS, naïve to any disease-modifying treatment (DMT), were enrolled. Controls were men with a normal evaluation who attended the Andrology Center of Catania, matched contemporaneously and at random to the group of RRMS patients. The aim of the study was to evaluate gonadal steroids and sperm quality in men at the time of RRMS diagnosis and 12 months after the first DMT. Results: Out of 41 patients with RRMS, 38 were included in the study (mean age 40.3 ± 12.3 years) and compared with matched controls. Patients with RRMS showed no differences in gonadal steroids or sperm parameters, except for free testosterone (fT) plasma levels, which were lower in RRMS patients than in controls (median 0.09 vs. 1.4, p < 0.0001). Correlation analyses, corrected for age and Body Mass Index, did not reveal any correlation between hormonal/sperm parameters and level of disability or disease activity at onset. Additionally, 12 months after the start of DMT, there were no differences in gonadal steroids or sperm quality compared to baseline. Conclusions: These results suggest that RRMS may not affect fertility status, but prospective long-term studies are needed.